Algorithms for Boolean Function Query 
Properties 



Scott Aaronson* 

Computer Science Division, UC Berkeley 
Berkeley, CA USA 94720-1776 
aaronson@cs.berkeley.edu


Abstract. We present new algorithms to compute fundamental properties
of a Boolean function given in truth-table form. Specifically, we give
an $O(N^{2.322} \log N)$ algorithm for block sensitivity, an $O(N^{1.585} \log N)$
algorithm for 'tree decomposition,' and an $O(N)$ algorithm for 'quasisymmetry.'
These algorithms are based on new insights into the structure
of Boolean functions that may be of independent interest. We also give
a subexponential-time algorithm for the space-bounded quantum query
complexity of a Boolean function. To prove this algorithm correct, we
develop a theory of limited-precision representation of unitary operators,
building on work of Bernstein and Vazirani.

Keywords: algorithm; Boolean function; truth table; query complexity; 
quantum computation. 
1 Introduction

The query complexity of Boolean functions, also called black-box or decision-tree
complexity, has been well studied for years. Numerous Boolean
function properties relevant to query complexity have been defined, such as 
sensitivity, block sensitivity, randomized and quantum query complexity, and 
degree as a real polynomial. But many open questions remain concerning the 
relationships between the properties. For example, are sensitivity and block 
sensitivity polynomially related? How small can quantum query complexity be, 
relative to randomized query complexity? Lacking answers to these questions, 
we may wish to gain insight into them by using computer analysis of small 
Boolean functions. But to perform such analysis, we need efficient algorithms 
to compute the properties in question. Such algorithms are the subject of the 
present paper. 

Let $f : \{0,1\}^n \to \{0,1\}$ be a Boolean function, and let $N = 2^n$ be the size
of the truth table of $f$. We seek algorithms that have modest running time as
a function of N, given the truth table as input. The following table lists some 
properties important for query complexity, together with the complexities of the 
most efficient algorithms for them of which we know. In the table, 'LP' stands 
for linear programming reduction. 

* Supported by an NSF Graduate Fellowship. Work done at Bell Laboratories / Lucent 
Technologies. 



Query Property                                   Complexity                      Source

Deterministic query complexity $D(f)$            $O(N^{1.585} \log N)$           Guijarro et al.
Certificate complexity $C(f)$                    $O(N^{1.585} \log N)$           Czort [6]
Degree as a real polynomial $\deg(f)$            $O(N^{1.585} \log N)$           This paper
Approximate degree $\widetilde{\deg}(f)$         About $O(N^5)$                  Obvious (LP)
Randomized query complexity $R_0(f)$             About $O(N^{7.925})$            Obvious (LP)
Block sensitivity $bs(f)$                        $O(N^{2.322} \log N)$           This paper
Quasisymmetry                                    $O(N)$                          This paper
Tree decomposition                               $O(N^{1.585} \log N)$           This paper
Quantum query complexity $Q_2(f)$                Exponential                     Obvious
$Q_2(f)$ with $O(\log n)$-qubit restriction      $O(N^{\mathrm{polylog}(N)})$    This paper



There is also a complexity-theory rationale for studying algorithmic problems 
such as those considered in this paper. Much effort has been devoted to finding 
Boolean function properties that do not naturalize in the sense of Razborov and
Rudich, and that might therefore be useful for proving circuit lower bounds.
In our view, it would help this effort to have a better general understanding of
the complexity of problems on Boolean function truth tables: both upper and
lower bounds. Such problems have been considered since the 1950s, but
basic open questions remain, especially in the setting of circuit complexity.
This paper addresses the much simpler setting of query complexity. 

We do not know of a polynomial-time algorithm to find quantum query
complexity; we raise this as an open problem. However, even finding quantum
query complexity via exhaustive search is nontrivial, since it involves representing
unitary operators with limited-precision arithmetic. The problem is deeper
than that of approximating unitary gates with bounded error, which was solved
by Bernstein and Vazirani [4]. In Section 7 we resolve the problem, and give
an $O(N^{\mathrm{polylog}(N)})$ constant-factor approximation algorithm for bounded-error
quantum query complexity if the memory of the quantum computer is restricted
to $O(\log n)$ qubits.

We have implemented some of the algorithms discussed in this paper in a
linkable C library, which is available for download.

2 Preliminaries 

A Boolean function $f$ is a total function from $\{0,1\}^n$ to $\{0,1\}$. We use $V_f$
to denote the set of variables of $f$, and use $X$, or alternatively $x_1, \ldots, x_n$, to
denote an input to $f$. If $X$ is an input, $|X|$ denotes the Hamming weight of $X$;
if $S$ is a set, $|S|$ denotes the cardinality of $S$. Particular Boolean functions to
which we refer are $\mathrm{AND}_n$, $\mathrm{OR}_n$, and $\mathrm{XOR}_n$, the AND, OR, and XOR functions
respectively on $n$ inputs.



3 Previous Work 

To our knowledge, no algorithms for block sensitivity, quasisymmetry, tree de- 
composition, or quantum query complexity have been previously published. But 
algorithms for simpler query properties have appeared in the literature. 



Given a Boolean function $f$, the deterministic query complexity $D(f)$ is the
minimum height of a decision tree representing $f$. Guijarro et al. give a simple
$O(N^{1.585} \log N)$ dynamic programming algorithm to compute $D(f)$. That $f$ is
given as a truth table is crucial: if $f$ is non-total and only the inputs for which
$f$ is defined are given, then computing $D(f)$ is (when phrased as a decision
problem) NP-complete.

The certificate complexity $C(f)$ is the maximum, over all inputs $X$, of the
minimum number of input bits needed to prove the value of $f(X)$. Equivalently,
$C(f)$ is the minimum height of a nondeterministic decision tree for $f$. Czort [6]
gives an $O(N^{1.585} \log N)$ algorithm to compute $C(f)$. Again, if $f$ is not given
as a full truth table, then computing $C(f)$ is NP-complete.

Let $\deg(f)$ be the minimum degree of an $n$-variate real multilinear polynomial
$p$ such that, for all $X \in \{0,1\}^n$, $p(X) = f(X)$. The following lemma,
adapted from Lemma 4 of Shi and Yao, is easily seen to yield an $O(N^{1.585} \log N)$
dynamic programming algorithm for $\deg(f)$. Say that a function obeys the parity
property if the number of inputs $X$ with odd parity for which $f(X) = 1$ equals the
number of inputs $X$ with even parity for which $f(X) = 1$.

Lemma 1 (Shi and Yao). $\deg(f)$ equals the size (that is, the number of variables
left unfixed) of the largest restriction of $f$ for which the parity property fails.
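As a rough illustration of how Lemma 1 can be turned into a dynamic program, the sketch below enumerates all $3^n = N^{\log_2 3} \approx N^{1.585}$ restrictions and records the largest one whose parity property fails. The names `degree` and `imbalance`, and the convention that bit $i$ of the table index holds variable $x_{i+1}$, are assumptions of this sketch and not taken from the paper; it makes no attempt to match the stated running time exactly.

```python
from functools import lru_cache
from itertools import product


def degree(truth_table, n):
    """Degree of f as a real multilinear polynomial, via the parity property.

    truth_table[x] is f(x); bit i of x is the value of variable i.
    A restriction is a tuple with entries 0, 1, or None (variable left free).
    """

    @lru_cache(maxsize=None)
    def imbalance(restriction):
        # (# even-parity inputs with f = 1) - (# odd-parity inputs with f = 1),
        # where parity is taken over the free variables of the restriction.
        for i, value in enumerate(restriction):
            if value is None:
                lo = restriction[:i] + (0,) + restriction[i + 1:]
                hi = restriction[:i] + (1,) + restriction[i + 1:]
                # Setting a free variable to 1 flips the parity, hence the minus.
                return imbalance(lo) - imbalance(hi)
        x = sum(bit << i for i, bit in enumerate(restriction))
        return truth_table[x]

    best = 0
    for restriction in product((0, 1, None), repeat=n):
        if imbalance(restriction) != 0:  # parity property fails
            best = max(best, restriction.count(None))
    return best
```

For example, the not-all-equal function on three bits has degree 2 even though it depends on all three variables, because its top-level imbalance cancels.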

4 Block Sensitivity 

Block sensitivity, introduced in [13], is a Boolean function property that is used
to establish lower bounds. There are several open problems that an efficient
algorithm for block sensitivity might help to investigate.

Let $X$ be an input to Boolean function $f$, and let $B$ (a block) be a nonempty
subset of $V_f$. Let $X(B)$ be the input obtained from $X$ by flipping the bits of $B$.

Definition 1. A block $B$ is sensitive on $X$ if $f(X) \neq f(X(B))$, and minimal
on $X$ if $B$ is sensitive and no proper sub-block $S$ of $B$ is sensitive. Then the
block sensitivity $bs_X(f)$ of $X$ is the maximum number of disjoint minimal (or
equivalently, sensitive) blocks on $X$. Finally $bs(f)$ is the maximum of $bs_X(f)$
over all $X$.

The obvious algorithm to compute $bs(f)$ (compute $bs_X(f)$ for each $X$ using
dynamic programming, then take the maximum) uses $\Theta(N^{2.585} \log N)$ time.
Here we show how to reduce the complexity to $O(N^{2.322} \log N)$ by exploiting the
structure of minimal blocks. Our algorithm has two main stages: one to identify
minimal blocks and store them for fast lookup, another to compute $bs_X(f)$ for
each $X$ using only minimal blocks. The analysis proceeds by showing that no
Boolean function has too many minimal blocks, and therefore that if the algorithm
is slow for some inputs (because of an abundance of minimal blocks), then
it must be faster for other inputs.

Algorithm 1 (computes $bs(f)$). For each input $X$:
1. Identify all sensitive blocks of $X$; place them in an AVL tree $T$.
2. Loop over all sensitive blocks $B$ in $T$ in lexicographic order ($\{x_1\}$, $\{x_2\}$,
$\{x_1, x_2\}$, $\{x_3\}$, ...). For each block $B$, loop over all $2^{n-|B|} - 1$ possible
blocks $C$ that properly contain $B$. Remove from $T$ all such blocks $C$ that are
in the tree; such blocks have been identified as non-minimal.
3. Create $2^n - 1$ lists, one list $L_S$ for each nonempty subset $S$ of variables.
Then, for each minimal block $B$ in $T$, insert a copy of $B$ into each list $L_S$
such that $B \subseteq S$. The result is that, for each $S$, $L_S = 2^S \cap T$, where $2^S$ is
the power set of $S$.
4. Let a state be a partition $(P, Q)$ of $V_f$. The set $P$ represents a union of
disjoint minimal blocks that have already been selected; the set $Q$ represents
the set of variables not yet selected. Then $bs_X(f) = \theta(\emptyset, V_f)$, where $\theta(P, Q)$
is defined via the recursion $\theta(P, Q) = 1 + \max_{B \in L_Q} \theta(P \cup B, Q - B)$. Here
$\theta(P, Q) = 0$ if $L_Q$ is empty. Compute $\theta(P, Q)$ using depth-first
recursion, caching the values of $\theta(P, Q)$ so that each needs to be computed
only once.

The block sensitivity is then the maximum of $bs_X(f)$ over all $X$.
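The four steps can be sketched compactly as follows. This is a simplified illustration rather than the paper's implementation: Python sets and lists stand in for the AVL tree and the lists $L_S$, blocks are encoded as bitmasks, and the function names are hypothetical. It follows the same idea of keeping only minimal blocks and then running a cached recursion over the set $Q$ of unused variables.

```python
from functools import lru_cache


def has_sensitive_proper_subset(block, sensitive):
    """True if some nonempty proper sub-block of `block` is in `sensitive`."""
    sub = (block - 1) & block
    while sub:
        if sub in sensitive:
            return True
        sub = (sub - 1) & block
    return False


def block_sensitivity(truth_table, n):
    """bs(f) for f given as a truth table indexed by n-bit integers."""
    full = (1 << n) - 1
    best = 0
    for x in range(1 << n):
        # Step 1: all sensitive blocks of x, as bitmasks.
        sensitive = {b for b in range(1, full + 1)
                     if truth_table[x] != truth_table[x ^ b]}
        # Step 2: keep only the minimal ones.
        minimal = [b for b in sensitive
                   if not has_sensitive_proper_subset(b, sensitive)]

        # Steps 3 and 4: theta(q) is the largest number of disjoint minimal
        # blocks that fit inside the unused-variable set q.
        @lru_cache(maxsize=None)
        def theta(q):
            return max((1 + theta(q & ~b) for b in minimal if (b & ~q) == 0),
                       default=0)

        best = max(best, theta(full))
    return best
```

On the three-bit majority function this returns 2: at input 011 the two singleton blocks on the 1-bits are each sensitive, while no input admits three disjoint sensitive blocks.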

Let $m(X, k)$ be the number of minimal blocks of $X$ of size $k$. The analysis of
Algorithm 1's running time depends on the following lemma, which shows that
large minimal blocks are rare in any Boolean function.

Lemma 2. $\sum_X m(X, k) \le 2^{n-k+1} \binom{n}{k}$.

Proof. The number of positions that can be occupied by a minimal block of
size $k$ is $\binom{n}{k}$ for each input, or $2^n \binom{n}{k}$ for all inputs. Consider an input $X$ with
a minimal block $B = \{b_1, \ldots, b_k\}$ of size $k > 1$. Block $B$ has $2^k - 1$ nonempty
subsets; label them $S_1, \ldots, S_{2^k - 1}$. By the minimality of $B$, for each $S_i$ the input
$X(S_i)$ has $\{b_1\}, \ldots, \{b_k\}$ as minimal blocks if $S_i = B$, and $B - S_i$ as a minimal
block if $S_i \neq B$. Therefore $X(S_i)$ cannot have $B$ as a minimal block. So of the
$2^n \binom{n}{k}$ positions, only one out of $2^k$ can be occupied by a minimal block of size $k$.
When $k = 1$ an additional factor of 2 is needed, since $X(B)$ has $B$ as a minimal
block. ■

Theorem 1. Algorithm 1 takes $O(N^{2.322} \log N)$ time.

Proof. Step 1 takes time $O(N^2 \log N)$, totaled over all inputs. Let us analyze
step 2, which identifies the minimal blocks. For each input $X$, every block $B$
that is selected is minimal, since each non-minimal block in $T$ was removed in a
previous iteration. Furthermore, for each block $B$ the number of removals of $C$
blocks is less than $2^{n-|B|}$. Therefore the total number of removals is at most

$\sum_X \sum_{k=0}^{n} m(X, k)\, 2^{n-k} = \sum_{k=0}^{n} \left[ 2^{n-k} \sum_X m(X, k) \right] \le \sum_{k=0}^{n} 2^{2n-2k+1} \binom{n}{k}$

which sums to $2N^{\log_2 5}$. Since each removal takes $O(\log N)$ time, the total time
is $O(N^{2.322} \log N)$.
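The closed form of this sum follows from the binomial theorem. The worked step below is added for convenience and uses only Lemma 2 and $N = 2^n$:

```latex
\sum_{k=0}^{n} 2^{2n-2k+1}\binom{n}{k}
  = 2\cdot 4^{n}\sum_{k=0}^{n}\binom{n}{k}\left(\tfrac{1}{4}\right)^{k}
  = 2\cdot 4^{n}\left(\tfrac{5}{4}\right)^{n}
  = 2\cdot 5^{n}
  = 2N^{\log_2 5},
\qquad \log_2 5 \approx 2.322 .
```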

We next analyze step 3, which creates the $2^n - 1$ lists $L_S$. Since each minimal
block $B$ is contained in $2^{n-|B|}$ sets of variables, the total number of insertions is
at most $\sum_{k=0}^{n} m(X, k)\, 2^{n-k}$ for input $X$. So the time is $O(N^{2.322} \log N)$ by the
previous calculation.

Finally we analyze step 4, which computes block sensitivity using the minimal
blocks. Each $\theta(P, Q)$ evaluation is performed at most once, and involves looping
through a list of minimal blocks contained in $Q$, with each iteration taking
$O(\log N)$ time. For each block $B$, the number of distinct $(P, Q)$ pairs such that
$B \subseteq Q$ is at most $2^{n-|B|}$. Therefore, again, the time for each input $X$ is at most
$(\log N) \sum_{k=0}^{n} m(X, k)\, 2^{n-k}$ and a bound of $O(N^{2.322} \log N)$ follows. ■

5 Quasisymmetry 

A Boolean function $f(X)$ is symmetric if its output depends only on $|X|$. Query
complexity is well understood for symmetric functions: for example, for all nonconstant
symmetric $f$, the deterministic query complexity is $n$ and the zero-error
quantum query complexity is $\Theta(n)$. Thus, a program for analyzing Boolean
functions might first check whether a function is symmetric, and if it is, dispense
with many expensive tests. We call $f$ quasisymmetric if some subset of input
bits can be negated to make $f$ symmetric. For example, $f = \mathrm{OR}(x_1, \neg x_2)$ is
quasisymmetric but not symmetric. There is an obvious $O(N^2)$ algorithm to
test quasisymmetry; here we sketch a linear-time algorithm.
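For reference, the obvious quadratic-time test can be sketched as below: try every one of the $N$ flip masks and check symmetry of the flipped table in $O(N)$ time each. This is the brute-force baseline, not the linear-time algorithm; the function names and the bitmask encoding of inputs are assumptions of the sketch.

```python
def is_symmetric(truth_table, n):
    """True if f(x) depends only on the Hamming weight of x."""
    value_by_weight = {}
    for x in range(1 << n):
        weight = bin(x).count("1")
        if value_by_weight.setdefault(weight, truth_table[x]) != truth_table[x]:
            return False
    return True


def quasisymmetric_masks(truth_table, n):
    """All flip masks a for which x -> f(x XOR a) is symmetric."""
    masks = []
    for a in range(1 << n):
        flipped = [truth_table[x ^ a] for x in range(1 << n)]
        if is_symmetric(flipped, n):
            masks.append(a)
    return masks
```

With bit 0 holding $x_1$ and bit 1 holding $x_2$, the table of $\mathrm{OR}(x_1, \neg x_2)$ is `[1, 1, 0, 1]`; the masks found are 1 and 2, which are complements of each other, and mask 0 is absent because the function is not itself symmetric.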

Call a restriction of $f$ a $z$-left-restriction if each variable $v_i$ is fixed if and only
if $i \le z$. Our algorithm recurses through all $z$-left-restrictions: when it is called
on restriction $R$, it calls itself recursively on $R_{x_{z+1}=0}$ and $R_{x_{z+1}=1}$. If either
of these is not quasisymmetric, then the algorithm returns failure; otherwise,
the algorithm tries to fit the restrictions together in such a way that $R$ itself
is seen to be quasisymmetric. It does this by testing whether $R_{x_{z+1}=0}(|X|) =
R_{x_{z+1}=1}(|X| \pm 1)$, with separate routines for the special cases in which $R_{x_{z+1}=0}$
or $R_{x_{z+1}=1}$ is a constant function or a XOR or $1 - \mathrm{XOR}$ function. If the
fitting-together process succeeds, then the algorithm returns both the output of
$R$ (encoded in compact form, as a symmetric function) and the set of input bits
that must be flipped to make $R$ symmetric. Crucially, these return values occupy
only $O(n - z)$ bits of space. The algorithm also has subroutines to handle the
special cases in which $R$ is a XOR function or a constant function. In these cases
$R$ is symmetric no matter which set of input bits is flipped. Since the time used
by each invocation is linear in $n - z$, the total time used is $\sum_{z=0}^{n} 2^z (n - z) = O(N)$.
The following lemma shows that the algorithm deals with all of the ways in which
a function can be quasisymmetric, which is key to the algorithm's correctness.

Lemma 3. Let $f$ be a Boolean function on $n$ inputs. If two distinct (and
non-complementary) sets of input bits $A$ and $B$ can be flipped to make $f$ symmetric,
then $f$ is either $\mathrm{XOR}_n$, $1 - \mathrm{XOR}_n$, or a constant function.

Proof. Assume without loss of generality that $B$ is empty. Then $A$ has
cardinality less than $n$. We know that $f(X)$ depends only on $|X|$, and also that it
depends only on $\|X\| = \sum_{i=1}^{n} \kappa(x_i)$ where $\kappa(x_i) = 1 - x_i$ if $x_i \in A$ and $\kappa(x_i) = x_i$
otherwise. Choose any Hamming weight $0 \le w \le n - 2$, and consider an input
$Y$ with $|Y| = w$ and with two variables $v_i$ and $v_j$ such that $v_i \in A$, $v_j \notin A$,
and $Y(i) = Y(j) = 0$. Let $Z = Y_{Y(i)=1,\, Y(j)=1}$, that is, $Y$ with bits $i$ and $j$ set to 1.
We have $|Z| = |Y| + 2$, but on the other hand $\|Z\| = \|Y\|$, so $f(Y) = f(Z)$ by symmetry. Again applying
symmetry, $f(P) = f(Q)$ whenever $|P| = w$ and $|Q| = w + 2$. Therefore $f$ is
either $\mathrm{XOR}_n$, $1 - \mathrm{XOR}_n$, or a constant function. ■
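Lemma 3 is easy to confirm exhaustively for very small $n$. The sketch below is an illustrative check and not part of the paper's algorithm; all names are hypothetical. It enumerates every Boolean function on $n$ inputs, collects the flip masks that make it symmetric, and verifies that any function with two distinct, non-complementary masks is one of the four exceptional functions.

```python
def symmetric_masks(table, n):
    """Flip masks a for which x -> table[x XOR a] depends only on weight."""
    size = 1 << n
    masks = []
    for a in range(size):
        seen = {}
        ok = True
        for x in range(size):
            weight = bin(x).count("1")
            if seen.setdefault(weight, table[x ^ a]) != table[x ^ a]:
                ok = False
                break
        if ok:
            masks.append(a)
    return masks


def check_lemma3(n):
    """Exhaustively test Lemma 3 over all 2^(2^n) functions on n inputs."""
    size = 1 << n
    full = size - 1
    xor = tuple(bin(x).count("1") & 1 for x in range(size))
    exceptional = {xor, tuple(1 - b for b in xor), (0,) * size, (1,) * size}
    for code in range(1 << size):
        table = tuple((code >> x) & 1 for x in range(size))
        masks = symmetric_masks(table, n)
        # Look for two masks that are distinct and not complements.
        has_pair = any(a != b and (a ^ b) != full for a in masks for b in masks)
        if has_pair and table not in exceptional:
            return False
    return True
```

The check is only practical for $n \le 3$ or so, since the number of functions grows doubly exponentially.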

6 Tree Decomposition 

Many of the Boolean functions of most interest to query complexity are naturally 
thought of as trees of smaller Boolean functions: for example, AND-OR trees and 
majority trees. Thus, given a function /, one of the most basic questions we 
might ask is whether it has a tree decomposition and if so what it is. In this 
section we define a sense in which every Boolean function has a unique tree 
decomposition, and we prove its uniqueness. We also sketch an $O(N^{1.585} \log N)$
algorithm for finding the decomposition. 

Definition 2. A distinct variable tree is a tree in which 

(i) Every leaf vertex is labeled with a distinct variable, 
(ii) Every non-leaf vertex $v$ is labeled with a Boolean function having as many
variables as $v$ has children, and depending on all of its variables.
(iii) Every non-leaf vertex has at least two children.

Such a tree represents a Boolean function in the obvious way. We call the tree 
trivial if it contains exactly one vertex. 

A tree decomposition of $f$ is a separation of $f$ into the smallest possible
components, with the exception of $(\neg)\mathrm{AND}_k$, $(\neg)\mathrm{OR}_k$, and $(\neg)\mathrm{XOR}_k$ components
(where $(\neg)$ denotes possible negation), which are left intact. The choice of
AND, OR, and XOR components is not arbitrary; these are precisely the three
components that "associate," so that, for example, $\mathrm{AND}(x_1, \mathrm{AND}(x_2, x_3)) =
\mathrm{AND}(\mathrm{AND}(x_1, x_2), x_3)$. Formally:

Definition 3. A tree decomposition of $f$ is a distinct variable tree representing
$f$ such that:

(i) No vertex is labeled with a function that can be represented by a nontrivial
tree, unless that function is $(\neg)\mathrm{AND}_k$, $(\neg)\mathrm{OR}_k$, or $(\neg)\mathrm{XOR}_k$ for some $k$.
(ii) No vertex labeled with $(\neg)\mathrm{AND}_k$ has a child labeled with $\mathrm{AND}_l$.
(iii) No vertex labeled with $(\neg)\mathrm{OR}_k$ has a child labeled with $\mathrm{OR}_l$.
(iv) No vertex labeled with $(\neg)\mathrm{XOR}_k$ has a child labeled with $(\neg)\mathrm{XOR}_l$.
(v) Any vertex labeled with a function that is constant on all but one input is
labeled with $(\neg)\mathrm{AND}_k$ or $(\neg)\mathrm{OR}_k$.

Let double-negation be the operation of negating the output of a function at 
some non-root vertex v, then negating the corresponding input of the function at 
$v$'s parent. Double-negation is a trivial way to obtain distinct decompositions.
This caveat aside, we can assert uniqueness: 

Theorem 2. Every Boolean function has a unique tree decomposition, up to 
double-negation. 



Proof. Given a vertex $v$ of a distinct variable tree, let $L(v)$ be the set of
variables in the subtree of which $v$ is the root. Assume that $f$ is represented
by two distinct tree decompositions, $T$ and $Z$, such that $T$ has a vertex $v_T$ and
$Z$ has a vertex $v_Z$ with $L(v_T)$ and $L(v_Z)$ incomparable (i.e. they intersect, but
neither contains the other). Then let $A = L(v_T) - L(v_Z)$, $B = L(v_Z) - L(v_T)$,
$I = L(v_T) \cap L(v_Z)$, and $U = V_f - L(v_T) - L(v_Z)$. The crucial lemma is the
following.

Lemma 4. $f$ is a function of $t(A)$, $i(I)$, $z(B)$, and $U$, for some Boolean
functions $t$, $i$, and $z$.

Proof. We can write $T$ as $T_{|U}[t_{AI}(A, I), B]$, where $t_{AI}$ is Boolean; similarly
we can write $Z$ as $Z_{|U}[A, z_{IB}(I, B)]$. We have that, for all settings of $U$,
$T_{|U}[t_{AI}(A, I), B] = Z_{|U}[A, z_{IB}(I, B)]$. Consider a restriction that fixes all the
variables in $B$. This yields $T_{|U,B}[t_{AI}(A, I)] = Z_{|U}[A, z_{I|B}(I)]$. Therefore, for
all restrictions of $B$, $t_{AI}$ depends on only a single bit obtained from $I$, namely
$z_{I|B}(I)$. So we can write $t_{AI}(A, I)$ as $t_\alpha(A, z_{I|B}(I))$ for some Boolean $t_\alpha$, or even
more strongly as $t_\alpha(A, t_\beta(I))$, since we know that $t_{AI}$ does not depend on $B$. By
analogous reasoning we can write $z_{IB}(I, B)$ as $z_\alpha(z_\beta(I), B)$ for some functions
$z_\alpha$ and $z_\beta$. So we have $T_{|U}[t_\alpha(A, t_\beta(I)), B] = Z_{|U}[A, z_\alpha(z_\beta(I), B)]$. Next we
restrict $A \cup B$, obtaining $T_{|U,B}[t_{\alpha|A}(t_\beta(I))] = Z_{|U,A}[z_{\alpha|B}(z_\beta(I))]$, which implies
that, for some functions $T'_{|U,B}$ and $Z'_{|U,A}$, $T'_{|U,B}[t_\beta(I)] = Z'_{|U,A}[z_\beta(I)]$. This
shows that $t_\beta(I)$ and $z_\beta(I)$ are equivalent up to negation of output, since $T$ and
$Z$ must depend on $I$ for some restriction of $A \cup B$. So we have $T_{|U}[t_{i(I)}(A), B] =
Z_{|U}[A, z_{i(I)}(B)]$ for some Boolean functions $i(I)$ (henceforth simply $i$), $t_i$, and
$z_i$ ($i \in \{0, 1\}$). Next we restrict $A$ and $i$: $T_{|U,A,i}[B] = Z_{|U,A}[z_{i|i}(B)]$. Thus, for
all restrictions of $A$ and $i$, $T$ depends on only a single bit obtained from $B$, which
we'll call $z_i(B)$ (and which can be taken equal to $z_{i|i}(B)$). Note that $z_i$ does
not depend on $A$. Analogously, for both possible restrictions of $i$, $Z$ depends
on only a single bit obtained from $A$, which we'll call $t_i(A)$. So we can write
$T_{tz|U}[t_i(A), z_i(B)] = Z_{tz|U}[t_i(A), z_i(B)]$ where $T_{tz|U}$ and $Z_{tz|U}$ are two-input
Boolean functions. We claim that $z_0 = z_1$ and $t_0 = t_1$.

There must exist a setting $u$ of $U$ such that $T_{tz|U=u}$ depends on both $t_i$ and
$z_i$. Suppose there exists a setting $b$ of $B$ such that $z_0(b) \neq z_1(b)$. $t_i$ must be
a nonconstant function, so find a constant $c$ such that $T_{tz|U=u}[c, z_i(B)]$ depends
on $z_i$, and choose a setting for $A$ and $i$ such that $t_i(A) = c$. (If $T_{tz|U=u}$ is a XOR
function, then either $c = 0$ or $c = 1$ will work, whereas if $T_{tz|U=u}$ is an AND or
OR function, then only one value of $c$ will work.) For $T_{tz|U=u}$ to be well-defined,
we need that whenever $t_i(A) = c$, the value of $i$ is determined (since $z_i$ has no
access to $i$). This implies that $t_i$ has the form $t(A) \wedge i$ or $t(A) \wedge \neg i$ for some
function $t$. Therefore $T_{tz|U=u}$ can be written as $T_{tiz|U=u}[t(A), i, z_i(B)]$ for some
function $T_{tiz|U=u}$.

Now repeat the argument for $Z_{tz|U=u}$. We obtain that $Z_{tz|U=u}$ can be written
as $Z_{tiz|U=u}[t_i(A), i, z(B)]$ for some functions $Z_{tiz|U=u}$ and $z$. Therefore
$T_{tiz|U=u}[t(A), i, z_i(B)] = Z_{tiz|U=u}[t_i(A), i, z(B)]$. So we can take $z_i(B) =
z(B)$ (equivalently $t_i(A) = t(A)$), and write $T_{tiz|U=u}$ (or $Z_{tiz|U=u}$) as
$T_{tiz|U=u}[t(A), i(I), z(B)]$. ■

We now prove the main theorem: that $f$ has a unique tree decomposition,
up to double-negation. From Lemma 4, $v_T$ effectively has as inputs the two
bits $t(A)$ and $i(I)$, and $v_Z$ the two bits $i(I)$ and $z(B)$. Thus we can check, by
enumeration, that either $v_T$ and $v_Z$ are labeled with the same function, and that
function is either $\mathrm{AND}_k$, $\neg\mathrm{AND}_k$, $\mathrm{OR}_k$, or $\neg\mathrm{OR}_k$; or $v_T$ and $v_Z$ are both labeled
with either $\mathrm{XOR}_k$ or $\neg\mathrm{XOR}_k$. (Note that $k$ can be different for $v_T$ and for $v_Z$.)

In either case, for all $u$ there exists a function $T_{tiz|U=u}$, taking $t(A)$, $i(I)$,
and $z(B)$ as input, that captures all that needs to be known about $A \cup I \cup B$.
Furthermore, since $t_{AI}$ and $z_{IB}$ do not depend on $u$, neither does $T_{tiz|U=u}$, and
we can write it as $T_{tiz}$. Let $v_m$ be the unique vertex in $T$ such that $L(v_m)$
contains $A \cup I \cup B$ and is minimal among all $L(v_i)$ sets that do so. If $v_m$ is
labeled with $(\neg)\mathrm{AND}_k$, $(\neg)\mathrm{OR}_k$, or $(\neg)\mathrm{XOR}_k$, then $v_T$ cannot be a vertex of
$T$. If $v_m$ is labeled with some other function, then $L(v_m) \neq A \cup I \cup B$ and
the function at $v_m$ is represented by a nontrivial tree. Either way we obtain a
contradiction.

Now that we have ruled out the possibility of incomparable subtrees, we can
establish uniqueness. Call a set $V \subseteq V_f$ unifiable if there exists a vertex $v$, in
some decomposition of $f$, such that $L(v) = V$. Let $C$ be the collection of all
unifiable sets. We have established that no pair $V_1, V_2 \in C$ is incomparable:
either $V_1 \cap V_2 = \emptyset$, $V_1 \subseteq V_2$, or $V_2 \subseteq V_1$. We claim that any decomposition
must contain a vertex $v_i$ with $L(v_i) = V_i$ for every $V_i \in C$. For suppose that
$V_i$ is not represented in some decomposition $F$. Certainly $V_i \neq V_f$, so let $V_P$
be the parent set of $V_i$ in $F$: that is, the unique minimal set such that $V_i \subset V_P$
and there exists a vertex $v_P$ in $F$ with $L(v_P) = V_P$. Then the function at $v_P$ is
represented by a nontrivial tree, containing a vertex $v_i$ with $L(v_i) = V_i$; were
it not, then $V_i$ could not be a vertex in any decomposition. Furthermore, the
function at $v_P$ cannot be $(\neg)\mathrm{AND}_k$, $(\neg)\mathrm{OR}_k$, or $(\neg)\mathrm{XOR}_k$. If it were, then again
$V_i$ could not be a vertex in any decomposition, since it would need to be labeled
correspondingly with $(\neg)\mathrm{AND}_l$, $(\neg)\mathrm{OR}_l$, or $(\neg)\mathrm{XOR}_l$. Having determined the
unique set of vertices that comprise any tree decomposition, the vertices' labels
are also determined up to double-negation. ■

We now sketch an algorithm to construct the tree decomposition. In a distinct
variable tree, let $L(v)$ be the set of variables in the subtree of which $v$ is the
root. Then given a subset $G$ of $V_f$, we can clearly decide in linear time whether
a distinct variable tree representing $f$ could have a vertex $u$ with $L(u) = G$. So
we can construct a decomposition in $O(N^2)$ time, by checking whether a vertex
$u$ could have $L(u) = G$ for each subset $G \subseteq V_f$ satisfying $2 \le |G| \le n - 1$.
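The text does not spell out the linear-time test. One standard way to perform such a check, assumed here rather than taken from the paper, is the classical column-multiplicity criterion for simple disjoint decomposition: $f$ can be written as $h(g(x_G), x_{\mathrm{rest}})$ for some Boolean $g$ and $h$ exactly when the settings of the variables in $G$ induce at most two distinct subfunctions on the remaining variables. The function names and bitmask encoding below are hypothetical, and the sketch ignores the side condition that each vertex function depend on all of its inputs.

```python
def submasks(mask):
    """All subsets of the bit set `mask`, including 0 and `mask` itself."""
    out = []
    sub = mask
    while True:
        out.append(sub)
        if sub == 0:
            break
        sub = (sub - 1) & mask
    return out


def could_be_subtree(truth_table, n, g_mask):
    """True if f(x) = h(g(x_G), x_rest) for some Boolean functions g and h.

    g_mask selects the variables in G; one pass over the table suffices.
    """
    rest_mask = ((1 << n) - 1) & ~g_mask
    rest_settings = submasks(rest_mask)
    # One column per setting of G: the induced subfunction on the other variables.
    columns = {tuple(truth_table[a | r] for r in rest_settings)
               for a in submasks(g_mask)}
    return len(columns) <= 2
```

For $f = \mathrm{AND}(x_1, \mathrm{OR}(x_2, x_3))$ the set $\{x_2, x_3\}$ passes the test while $\{x_1, x_2\}$ does not, and no two-variable subset passes for three-bit majority.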

The key insight for reducing the time to $O(N^{1.585} \log N)$ is to represent each
restriction by a concise codeword, which takes up only $O(n)$ bits rather than $2^{|G|}$
bits. We create the codewords recursively, starting with the smallest restrictions
and working up to larger ones. The codewords need to satisfy the following
conditions:



— Two restrictions g and h over the same set of variables get mapped to iden- 
tical codewords if and only if g = h. 

— If g is the negation of h, then this fact is easy to tell given the codewords of 
g and h. 

— If a restriction is constant, then this fact is also easy to tell given its code- 
word. 

We can satisfy these conditions by building up a binary tree of restrictions at each
recursive call, then assigning each restriction a codeword based on its position
in the tree. For all $G \neq \emptyset$, each object inserted into the tree is a concatenation
of two codewords of size-$(|G| - 1)$ restrictions.

After the codewords are created, a second phase of the algorithm deletes
redundant $\mathrm{AND}_k$, $\mathrm{OR}_k$, and $(\neg)\mathrm{XOR}_k$ vertices. This phase looks for vertices $u$
and $v$ with $L(u)$ and $L(v)$ incomparable, which, as a consequence of Theorem
2, can only have arisen by $\mathrm{AND}_k$, $\mathrm{OR}_k$, or $(\neg)\mathrm{XOR}_k$. Both phases effectively
perform an $O(\log N)$-time operation for all subsets of subsets of $V_f$, so the
complexity is $O(N^{1.585} \log N)$.

7 Quantum Query Complexity 

The quantum query complexity of a Boolean function $f$ is the minimum number
of oracle queries needed by a quantum computer to evaluate $f$. Here we are
concerned only with the bounded-error query complexity $Q_2(f)$ (defined in [3]),
since approximating unitary matrices with finite precision introduces bounded
error into any quantum algorithm. A quantum query algorithm $\Gamma$ proceeds by
an alternating sequence of $T + 1$ unitary transformations and $T$ query
transformations: $U_0 \to Q_1 \to U_1 \to \cdots \to Q_T \to U_T$. Then $Q_2(f)$ is the minimum of $T$
over all $\Gamma$ that compute $f$ with bounded error.

There are several open problems that an efficient algorithm to compute $Q_2(f)$
might help to investigate. Unfortunately, we do not know of such an
algorithm. Here we show that, if we limit the number of qubits, we can obtain
a subexponential-time approximation algorithm via careful exhaustive search.

7.1 Overview of Result 

For what follows, it will be convenient to extend the quantum oracle model 
to allow intermediate observations. With an unlimited workspace, this cannot 
change the number of queries needed. In the space-bounded setting, however,
it might make a difference. 

We define a composite algorithm $\Gamma'$ to be an alternating sequence $\Gamma_1 \to
D_1 \to \cdots \to \Gamma_t \to D_t$. Each $\Gamma_i$ is a quantum query algorithm that uses $T_i$
queries and at most $m$ qubits of memory for some $m \ge \log_2 n + 2$. When $\Gamma_i$
terminates a basis state $|\psi_i\rangle$ is observed. Each $D_i$ is a decision point, which
takes as input the sequence $|\psi_1\rangle, \ldots, |\psi_i\rangle$ and as output decides whether to
(1) halt and return $f = 0$, (2) halt and return $f = 1$, or (3) continue to $\Gamma_{i+1}$.
(The final decision point, $D_t$, must select between (1) and (2).) There are no
computational restrictions placed on the decision points. However, a decision
point cannot modify the quantum algorithms that come later in the sequence; it
can only decide whether to continue with the sequence. For a particular input, let
$p_k$ be the probability, over all runs of $\Gamma'$, that quantum algorithm $\Gamma_k$ is invoked.
Then $\Gamma'$ uses a total number of queries $\sum_{k=1}^{t} p_k T_k$.

We define the space-bounded quantum query complexity $SQ_{2,m}(f)$ to be the
minimum number of queries used by any composite algorithm that computes $f$
with error probability at most 1/3 and that is restricted to $m$ qubits. We give
an approximation algorithm for $SQ_{2,m}(f)$ taking time $2^{O(4^m mn)}$, which when
$m = O(\log n)$ is $O(N^{\mathrm{polylog}(N)})$. The approximation ratio is $\sqrt{22}/3 + \epsilon$ for any
$\epsilon > 0$. The difficulty in proving the result is as follows.

A unitary transformation is represented by a continuous- valued matrix, which 
might suggest that the quantum model of computation is analog rather than 
digital. But Bernstein and Vazirani [4] showed that, for a quantum computation
taking $T$ steps, the matrix entries need to be accurate only to within $O(\log T)$
bits of precision in the bounded-error model. However, when we try to represent
unitary transformations on a computer with finite precision, a new problem 
arises. On the one hand, if we allow only matrices that are exactly unitary, 
we may not be able to approximate every unitary matrix. So we also need to 
admit matrices that are almost unitary. For example, we might admit a matrix 
if the norm of each row is sufficiently close to 1, and if the inner product of 
each pair of distinct rows is sufficiently close to 0. But how do we know that 
every such matrix is close to some actual unitary matrix? If it is not, then 
the transformation it represents cannot even approximately be realized by a 
quantum computer. 

We resolve this issue as follows. First, we show that every almost-unitary 
matrix is close to some unitary matrix in a standard metric. Second, we show 
that every unitary matrix is close to some almost-unitary matrix representable 
with limited precision. Third, we upper-bound the precision that suffices for a 
quantum algorithm, given a fixed accuracy that the algorithm needs to attain. 

An alternative approach to approximating $SQ_{2,m}(f)$ would be to represent
each unitary matrix as a product of elementary gates. Kitaev and independently
Solovay showed that a $2^m \times 2^m$ unitary matrix can be represented
with arbitrary accuracy $\delta > 0$ by a product of $2^{O(m)}\,\mathrm{polylog}(1/\delta)$ unitary gates.
But this yields a $2^{2^{\cdots}}$ algorithm, which is slower than ours. Perhaps
the construction or its analysis can be improved; in any case, though, this
approach is not as natural for the setting of query complexity.

7.2 Almost-Unitary Matrices 

Let $u \cdot v$ denote the conjugate inner product of $u$ and $v$. The distance $|A - B|$
between matrices $A = (a_{ij})$ and $B = (b_{ij})$ in the $L_{\max}$ norm is defined to be
$\max_{i,j} |a_{ij} - b_{ij}|$.

Definition 4. A matrix $A$ is $q$-almost-unitary if $|I - AA^\dagger| < q$.

In the following lemma, we start with an almost-unitary matrix $A$ and construct an
actual unitary matrix $U$ that is close to $A$ in the $L_{\max}$ norm.

Lemma 5. Let $A$ be a $q$-almost-unitary $s \times s$ matrix, with $s \ge 2$ and $q < 1/4s$.
Then there exists a unitary matrix $U$ such that $|A - U| < 4.91 q\sqrt{s}$.



Proof. We first normalize each row $A_i$ so that $A_i \cdot A_i = 1$. For each entry $a_{ij}$,
$|a_{ij}/(A_i \cdot A_i) - a_{ij}| = |a_{ij}|\,|1 - (A_i \cdot A_i)| / |A_i \cdot A_i| < q(1+q)/(1-q)$. We next
form a unitary matrix $B$ from $A$ by using the Classical Gram-Schmidt (CGS)
orthogonalization procedure. The idea is to project $A_2$ to
make it orthogonal to $A_1$, then project $A_3$ to make it orthogonal to both $A_1$ and
$A_2$, and so on. Initially we set $B_1 \leftarrow A_1$. Then for each $2 \le i \le s$, we set
$B_i \leftarrow A_i - \sum_{j=1}^{i-1} (A_i \cdot B_j) B_j$. Therefore
$A_i \cdot B_k = (A_i \cdot A_k) - \sum_{j=1}^{k-1} (A_i \cdot B_j)(A_k \cdot B_j)$.
We need to show that the discrepancy between $A$ and $B$ does not increase
too drastically as the recursion proceeds. Let $\sigma_k = \max_{i \neq k} |A_i \cdot B_k|$. By hypothesis
$\sigma_1 < q$. Then $\sigma_k \le \sigma_1 + \sum_{j=1}^{k-1} \sigma_j^2$. Assume that $\sigma_k < q + 4q^2 s$ for all $k \le K$. By
induction, $\sigma_{K+1} < q + K(q + 4q^2 s)^2 < q + 4q^2 s$ since $q < 1/4s$ and $K < s$. So
for all $k$, $\sigma_k < q + 4q^2 s$.

Let $\phi = |A - B|$. By the definition of $B$, $\phi \le \sigma_1 |w_1| + \cdots + \sigma_s |w_s|$ where
$w$ is a column of $B$. Since $|w_1|^2 + \cdots + |w_s|^2 = 1$, $\phi$ is maximized when $w_i =
\sigma_i \sqrt{s} / (\sigma_1 + \cdots + \sigma_s)$, or $\phi \le (\sigma_1^2 + \cdots + \sigma_s^2)\sqrt{s} / (\sigma_1 + \cdots + \sigma_s) < (q + 4q^2 s)^2 \sqrt{s}/q$.

Adding $q(1+q)/(1-q)$ from normalization yields a quantity less than
$(4 + 9\sqrt{2}/14)\, q\sqrt{s} \approx 4.91 q\sqrt{s}$. This can be seen by working out the arithmetic
for the worst case of $s = 2$, $q = 1/4s$. ■

The next lemma, which is similar to Lemma 6.1.3 of [4], is a sort of converse
to Lemma 5: we start with an arbitrary unitary matrix, and show that truncating
its entries to a precision $\delta > 0$ produces an almost-unitary matrix.

Lemma 6. Let $U$ and $V$ be $s \times s$ matrices with $s \ge 2$ and $|U - V| < \delta$. If $U$ is
unitary, then $V$ is $(2\delta\sqrt{s} + \delta^2 s)$-almost-unitary.

Proof. First, $V_i \cdot V_i = \sum_{k=1}^{s} |u_k + \gamma_k|^2 = 1 + \sum_{k=1}^{s} (u_k \gamma_k^* + u_k^* \gamma_k + \gamma_k \gamma_k^*)$ where
the $u_k$'s are entries of $U$ and the $\gamma_k$'s are error terms satisfying $|\gamma_k| < \delta$. So
by the Cauchy-Schwarz inequality, $V_i \cdot V_i$ differs from 1 by at most $2\delta\sqrt{s} + \delta^2 s$.
Second, for $i \neq j$, $V_i \cdot V_j = \sum_{k=1}^{s} (u_k + \gamma_k)(w_k + \eta_k)^*$ where the $\gamma_k$'s and $\eta_k$'s
are error terms, and the argument proceeds analogously. ■
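The two lemmas can be exercised numerically on a small example. The sketch below is illustrative only: it uses plain Python lists of complex numbers, truncates each entry toward zero to a multiple of `delta`, and applies textbook Gram-Schmidt with per-row normalization, which may differ in detail from the procedure analyzed in the proof of Lemma 5. All function names are hypothetical.

```python
import math


def lmax_distance(a, b):
    """L_max distance between two matrices given as lists of rows."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def almost_unitarity(a):
    """Largest entry of |I - A A^dagger|, i.e. how far A is from unitary."""
    s = len(a)
    product = [[sum(a[i][k] * a[j][k].conjugate() for k in range(s))
                for j in range(s)] for i in range(s)]
    identity = [[1.0 if i == j else 0.0 for j in range(s)] for i in range(s)]
    return lmax_distance(identity, product)


def truncate(u, delta):
    """Truncate real and imaginary parts of every entry to multiples of delta."""
    return [[complex(math.trunc(z.real / delta) * delta,
                     math.trunc(z.imag / delta) * delta) for z in row]
            for row in u]


def gram_schmidt_rows(a):
    """Orthonormalize the rows of A (classical Gram-Schmidt with normalization)."""
    basis = []
    for row in a:
        v = list(row)
        for b in basis:
            # Conjugate inner product of the original row with the basis row.
            coeff = sum(x * y.conjugate() for x, y in zip(row, b))
            v = [x - coeff * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        basis.append([x / norm for x in v])
    return basis
```

A possible use: build a $2 \times 2$ rotation matrix, truncate it with `delta = 0.01`, and compare `almost_unitarity` of the result with the bound $2\delta\sqrt{s} + \delta^2 s$ of Lemma 6; then orthonormalize it and compare the $L_{\max}$ distance moved with the bound $4.91 q\sqrt{s}$ of Lemma 5.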

7.3 Searching for Quantum Algorithms 

In this section we use the results on almost-unitary matrices to construct an 
algorithm. First we need a lemma about error buildup in quantum algorithms, 
which is similar to Corollary 3.4.4 of [4] (though the proof technique is different).

Lemma 7. Let $U_1, \ldots, U_T$ be $s \times s$ unitary matrices, $\hat{U}_1, \ldots, \hat{U}_T$ be $s \times s$ arbitrary
matrices, and $v$ be an $s \times 1$ vector with $\|v\|_2 = 1$. Suppose that, for all $i$,
$|\hat{U}_i - U_i| < 1/cs$, where $c > T/2$. Then $\hat{U}_1 \cdots \hat{U}_T v$ differs from $U_1 \cdots U_T v$ by at
most $2T / [\sqrt{s}\,(2c - T)]$ in the $L_2$ norm.

Proof. For each $i$, let $E_i = \hat{U}_i - U_i$. By hypothesis, every entry of $E_i$ has
magnitude at most $1/cs$; thus, each row or column $w$ of $E_i$ has $\|w\|_2 \le 1/(c\sqrt{s})$. Then
$\hat{U}_1 \cdots \hat{U}_T v = (U_1 + E_1) \cdots (U_T + E_T) v$. The right-hand side, when expanded,
has $2^T$ terms. Any term containing $k$ matrices $E_i$ has $L_2$ norm at most $s^{-1/2} c^{-k}$,
and can therefore add at most $c^{-k}/\sqrt{s}$ to the discrepancy with $U_1 \cdots U_T v$. So
the total discrepancy is at most $s^{-1/2} \sum_{k=1}^{T} \binom{T}{k} (1/c)^k \le s^{-1/2} (e^{T/c} - 1)$. Since
$d \ln t / dt$ evaluated at $t = 2c$ is $1/2c$ and since $\ln t$ is concave, $\ln(2c + T) - \ln(2c -
T) \ge 2T/2c = T/c$ when $T < 2c$. Therefore $e^{T/c} \le (2c + T)/(2c - T)$ and the
discrepancy is at most $2T / [\sqrt{s}\,(2c - T)]$ in the $L_2$ norm. ■

Applying Lemmas 5, 6, and 7, we now prove the main theorem.

Theorem 3. There exists an approximation algorithm for $SQ_{2,m}(f)$ taking time
$2^{O(4^m mn)}$, with approximation ratio $\sqrt{22}/3 + \epsilon$.

Proof. Given $f$, we want, subject to the following two constraints, to find an
algorithm $\Gamma'$ that approximates $f$ with a minimum number of queries. First, $\Gamma'$
uses at most $m$ qubits, meaning that $s = 2^m$ and the relevant matrices are $2^m \times
2^m$. Second, the correctness probability of $\Gamma'$ is known to a constant accuracy
$\pm\epsilon$. Certainly the number $T$ of queries never needs to be more than $n$, for,
although each quantum algorithm is space-bounded, the composite algorithm
need not be. Let $\Delta$ be the $L_{\max}$ error we can tolerate in the matrices, and let $\Lambda$
be the resultant $L_2$ error in the final states. Setting $c = 1/(\Delta 2^m)$, by Lemma 7
we have $\Lambda \le 2n / [2^{m/2} (2^{1-m}/\Delta - n)]$. From the Cauchy-Schwarz inequality,
one can show that $\epsilon \le 2\Lambda$. Then solving for $1/\Delta$, $1/\Delta \le 2^{m/2} n (2/\epsilon + 1)$ which,
since $\epsilon$ is constant, is $\Theta(2^{m/2} n)$. Solving for $c$, we can verify that $c > T/2$,
as required by Lemma 7. If we generate almost-unitary matrices, they need to
be within $\Delta$ of actual unitary matrices. By Lemma 5 we can use $\Delta/(4.91\sqrt{s})$-almost-unitary
matrices. Finally we need to ensure that we approximate every
unitary matrix. Let $\delta$ be the needed precision. Invoking Lemma 6, we set
$\Delta/(4.91\sqrt{s}) \ge 2\delta\sqrt{s} + \delta^2 s$ and obtain that $\delta \le \max[\Delta/(9.82 s),\ \Delta^{1/2}/(2.22 s^{3/4})]$
is sufficient.

Therefore the number of bits of precision needed per entry, $\log(1/\delta)$, is $O(m)$.
We thus need only $O(4^m mn)$ bits to specify $\Gamma'$, and can search through all possible
$\Gamma'$ in time $2^{O(4^m mn)}$. The amount of time needed to evaluate a composite
algorithm $\Gamma'$ is polynomial in $m$ and $n$, and is absorbed into the exponent. The
approximation algorithm is this: first let $\epsilon > 0$ be a constant at most 0.0268, and
let $\omega = \frac{22}{9} + \frac{4}{3}\epsilon - 8\epsilon^2$. Then find the smallest $T$ such that the maximum probability
of correctness over all $T$-query algorithms $\Gamma'$ is at least $2/3 - \epsilon$ (subject to $\pm\epsilon$
uncertainty), and return $T\sqrt{\omega}$. The algorithm achieves an approximation ratio
of $\sqrt{\omega}$, for the following reason. First, $T \le SQ_{2,m}(f)$. Second, $\omega T \ge SQ_{2,m}(f)$,
since by repeating the optimal algorithm $\Gamma^*$ until it returns the same answer
twice (which takes either two or three repetitions), the correctness probability
can be boosted above 2/3. Finally, a simple calculation reveals that $\Gamma^*$ returns
the same answer twice after expected number of invocations $\omega$. ■
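The "simple calculation" at the end of the proof can be reproduced as follows. The sketch assumes, as one reading of the proof, that each run of the algorithm is independently correct with probability $p = 2/3 - 2\epsilon$ (the $2/3 - \epsilon$ threshold less the $\pm\epsilon$ uncertainty); under that assumption the expected number of runs matches $\omega = 22/9 + (4/3)\epsilon - 8\epsilon^2$, so that $\sqrt{\omega}$ tends to $\sqrt{22}/3$ as $\epsilon \to 0$. The function names are hypothetical.

```python
import math


def expected_invocations(p):
    """Expected number of runs until some answer has appeared twice."""
    # Two runs always happen; a third is needed exactly when they disagree.
    return 2 + 2 * p * (1 - p)


def boosted_correctness(p):
    """Probability that the answer seen twice is the correct one."""
    # Either the first two runs are both correct, or exactly one of them is
    # wrong and the third run is correct.
    return p * p + 2 * p * (1 - p) * p


def omega(eps):
    """The constant from the proof, as a function of epsilon."""
    return 22 / 9 + (4 / 3) * eps - 8 * eps ** 2
```

Under the same assumption, `boosted_correctness(2/3 - 2 * eps)` stays at or above 2/3 for `eps` up to about 0.0268 and drops below it shortly after, which is consistent with the bound placed on $\epsilon$ in the proof.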